
Conversation

@lizexu123 (Collaborator) commented Jul 31, 2025

Support a user-supplied seed parameter.

Example usage:

1. Serving usage:

1.1 Random output on each request

```python
import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=None,  # this line can also be omitted
)

print(response.choices[0].message.content)
print("\n")
```

Alternatively, with curl:

```bash
curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ]
}'
```

1.2 Fixed (deterministic) output

```python
import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=1,
)

print(response.choices[0].message.content)
print("\n")
```

Alternatively, with curl:

```bash
curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ],
  "seed": 1
}'
```
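To sanity-check the deterministic path, a minimal sketch (assuming the service configured above is reachable at the same address) issues the same request twice with the same seed and compares the results:

```python
import openai

# Assumes the service from the examples above is running.
client = openai.Client(base_url="http://0.0.0.0:13188/v1", api_key="EMPTY_API_KEY")

def ask(seed):
    response = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "北京天安门在哪里?"}],
        temperature=1,
        stream=False,
        seed=seed,
    )
    return response.choices[0].message.content

# With the same seed, two independent requests should return identical text.
assert ask(seed=1) == ask(seed=1)
```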

2. Offline usage

2.1 Random output

```python
from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters
sampling_params = SamplingParams(temperature=0.1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)
```

2.2 Deterministic output

```python
from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters: a fixed seed makes the output deterministic
sampling_params = SamplingParams(temperature=0.1, seed=1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)
```
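As a quick offline check (a sketch under the same setup as above), two runs with freshly constructed SamplingParams and the same seed are expected to produce the same completion:

```python
from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

llm = LLM(model="Qwen/Qwen3-0.6B", tensor_parallel_size=1, reasoning_parser="qwen3")
messages = [{"role": "user", "content": "北京天安门在哪里?"}]

# Build the sampling params fresh for each run to rule out shared state;
# with the same seed both completions should be identical.
out1 = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
out2 = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
print(out1)
print(out2)  # expected to match out1
```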


paddle-bot bot commented Jul 31, 2025

Thanks for your contribution!

@qingqing01 (Collaborator) left a comment

Please add unit tests to verify output stability with a fixed seed, and likewise add unit tests for the stability of sampling with a fixed seed.
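One possible shape for such a test (a hedged sketch, not the test actually added in this PR; the module name is hypothetical and it reuses the offline API from the description):

```python
# test_seed_stability.py  (hypothetical test module)
import pytest

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

@pytest.fixture(scope="module")
def llm():
    # Model and parser follow the examples in the PR description.
    return LLM(model="Qwen/Qwen3-0.6B", tensor_parallel_size=1, reasoning_parser="qwen3")

def test_fixed_seed_is_stable(llm):
    messages = [{"role": "user", "content": "北京天安门在哪里?"}]
    # Comparing string forms is a stand-in; a real test would compare
    # the generated text field of the returned outputs.
    runs = [
        str(llm.chat([messages], SamplingParams(temperature=1.0, seed=1)))
        for _ in range(3)
    ]
    assert len(set(runs)) == 1  # fixed seed -> identical completions
```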
